AITopics | great expectation

great expectation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Best prompts to get the most out of an AI chatbot

FOX News | Aug-21-2023, 19:00:28 GMT

Many people are turning to chatbots as search engines. Kurt "The CyberGuy" Knutsson explains how to use them. People love using AI chatbots to assist them with tasks or to simply answer a question that they don't know the answer to. However, a chatbot can only answer to the best of its ability, and we have to also do our part to help it answer our questions as accurately as possible. If you don't get specific enough with how you present your prompts, a chatbot could give you a generic or even incorrect answer.

chatbot, cyberguy, information, (13 more...)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)

How to Test PySpark ETL Data Pipeline

#artificialintelligence | Dec-6-2022, 15:40:13 GMT

"Garbage in, garbage out" is a common expression used to emphasize the importance of data quality for tasks such as machine learning, data analytics and business intelligence. With an increasing amount of data being created and stored, building high-quality data pipelines has never been more challenging. PySpark is a commonly used tool for building ETL pipelines for large datasets. A common question that arises while building a data pipeline is "How do we know that our data pipeline is transforming the data in the way that is intended?". To answer this question, we borrow the idea of unit testing from software development.
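
The unit-test idea mentioned above can be sketched in a few lines. This is a minimal illustration, not the article's code: the transformation `clean_orders`, its column names and the sample rows are all hypothetical, and plain Python lists of dictionaries stand in for a PySpark DataFrame so that the example runs without a Spark session.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def clean_orders(rows: List[Row]) -> List[Row]:
    """Hypothetical transformation step: drop rows without an order_id
    and add a derived 'total' column."""
    cleaned = []
    for row in rows:
        if row.get("order_id") is None:
            continue  # rows without a key are rejected by this step
        cleaned.append({**row, "total": row["quantity"] * row["unit_price"]})
    return cleaned


def test_clean_orders() -> None:
    # Small, hand-written input whose expected output is easy to verify.
    source = [
        {"order_id": 1, "quantity": 2, "unit_price": 5.0},
        {"order_id": None, "quantity": 1, "unit_price": 9.0},
    ]
    result = clean_orders(source)
    assert len(result) == 1              # the row without order_id is dropped
    assert result[0]["total"] == 10.0    # 2 * 5.0


test_clean_orders()
```

In a real pipeline the same pattern would presumably apply to DataFrames: build a tiny input by hand, run one transformation, and assert on the collected output.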

data pipeline, expectation suite, pipeline, (11 more...)

Technology:

Information Technology > Data Science > Data Integration (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.61)
Information Technology > Data Science > Data Quality (0.56)

Hopsworks 3.0: The Python-Centric Feature Store

#artificialintelligence | Aug-11-2022, 10:06:22 GMT

Feature stores began in the world of Big Data, with Spark being the feature engineering platform for Michelangelo (the first feature store) and Hopsworks (the first open-source feature store). Nowadays, the modern data stack has assumed the role of Spark for feature stores: feature engineering code can be written that seamlessly scales to large data volumes in Snowflake, BigQuery, or Redshift. However, Python developers know that feature engineering is so much more than the aggregations and data validation you can do in SQL and DBT. Dimensionality reduction, whether using PCA or embeddings, and transformations are fundamental steps in feature engineering that are not available in SQL today, even with UDFs (user-defined functions). Over the last few years, we have had an increasing number of customers who prefer working with Python for feature engineering.

feature group, hopswork, pipeline, (13 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.96)

Data Quality

#artificialintelligence | Aug-1-2022, 20:10:11 GMT

You can bet that you will be asked what kind of data issues you might encounter in your day job during one of your data engineer or data scientist interviews. Data quality will do more for model performance than any other technique. You could train a complicated deep learning model on massive amounts of data, but if the underlying data is bad, so too will be the model's inferences. In this article, we will attempt to address common data quality issues. As mentioned in the Kaggle tutorial on handling missing values, we need to distinguish between values that are missing because they were not recorded and values that are missing because they don't exist.
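
The distinction between the two kinds of missing values can be made concrete with a small sketch. Everything here is an assumption for illustration: `None` marks a value that was not recorded, the hypothetical marker `NOT_APPLICABLE` marks a value that does not exist, and median imputation is just one possible way to fill the former.

```python
from statistics import median
from typing import List, Union

NOT_APPLICABLE = "n/a"  # hypothetical marker: the value does not exist

Value = Union[int, float, str, None]


def impute_not_recorded(values: List[Value]) -> List[Value]:
    """Fill values that were not recorded (None) with the median of the
    recorded ones; leave values that do not exist untouched."""
    recorded = [v for v in values if isinstance(v, (int, float))]
    fill = median(recorded)
    return [fill if v is None else v for v in values]


# Hypothetical column "spouse_income": None means nobody wrote it down,
# NOT_APPLICABLE means there is no spouse, so guessing a number is wrong.
column = [50, None, NOT_APPLICABLE, 70]
filled = impute_not_recorded(column)
```

Only the `None` entry is replaced; the `NOT_APPLICABLE` entry is kept as a distinct category rather than being treated as a number to estimate.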

data quality, dataset, make sense, (7 more...)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.56)

Building an Open Source ML Pipeline: Part 2

#artificialintelligence | May-31-2022, 16:56:03 GMT

In order to set up event-driven workflows, we need to add another tool to our toolkit, namely Argo Events. Setting up Argo Events can be a bit tricky, but I provide the necessary YAML files to do so on GitHub here. We will start off with one of the project's own examples, just to make sure that everything is installed correctly. So once you have cloned the repository, feel free to take a look at the contents of these manifests. As far as modifications go, you need to replace the base64 encoded credentials in 'minio-secret.yaml'
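
Producing the base64 encoded credentials mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the helper `encode_secret_value` and the placeholder credential strings are hypothetical, and the actual keys expected in the manifest are not shown in the excerpt.

```python
import base64


def encode_secret_value(plain: str) -> str:
    """Return the base64 text form of a credential, as used for values
    in a Kubernetes-style secret manifest."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Inverse helper, handy for checking what a manifest currently holds."""
    return base64.b64decode(encoded).decode("utf-8")


# Placeholder credentials (assumptions, not real values).
access_key = encode_secret_value("my-access-key")
secret_key = encode_secret_value("my-secret-key")

print(access_key)
print(secret_key)
```

Note that base64 is an encoding, not encryption, so the manifest should still be handled as sensitive material.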

argo event, event source, open source ml pipeline, (9 more...)

Technology:

Information Technology > Software (0.40)
Information Technology > Artificial Intelligence > Machine Learning (0.40)

Reducing Pipeline Debt With Great Expectations

#artificialintelligence | Apr-29-2022, 15:15:17 GMT

This article was first published on Neptune AI's blog. You are a part of a data science team at a product company. Your team has a number of machine learning models in place. Their outputs guide critical business decisions, as well as a couple of dashboards displaying important KPIs that are closely watched by your executives day and night. On that fatal day, you had just brewed yourself a cup of coffee and were about to begin your workday when the universe collapsed. Everyone at the company went crazy. The business metrics dashboard was displaying what seemed to be random numbers (except every full hour, when the KPIs looked okay for a short time) and the models were predicting the company's insolvency looming fast. What is worse, every attempt to resolve this madness resulted in your data engineering and research teams reporting new broken services and models. That was the debt collection day and the unpaid debt was of the worst kind: pipeline debt.

data pipeline, expectation suite, great expectation, (16 more...)

Country: North America > United States > Illinois > Cook County > Chicago (0.04)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.91)

Memory and Knowledge Augmented Language Models for Inferring Salience in Long-Form Stories

Wilmot, David, Keller, Frank

arXiv.org Artificial Intelligence | Sep-14-2021

Measuring event salience is essential in the understanding of stories. This paper takes a recent unsupervised method for salience detection derived from Barthes' Cardinal Functions and theories of surprise and applies it to longer narrative forms. We improve the standard transformer language model by incorporating an external knowledgebase (derived from Retrieval Augmented Generation) and adding a memory mechanism to enhance performance on longer works. We use a novel approach to derive salience annotation using chapter-aligned summaries from the Shmoop corpus for classic literary works. Our evaluation against this data demonstrates that our salience detection model improves performance over and above a non-knowledgebase and memory augmented language model, both of which are crucial to this improvement.

computational linguistic, evaluation, salience, (14 more...)

2109.03754

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Amazon and the Burden of Great Expectations

WSJ.com: WSJD - Technology | Jun-12-2021, 04:00:00 GMT

It proved to be a near-fatal mistake for their small business, which is the family's sole source of income, and employs all four of the couple's teenage and adult children, as well as Mrs. Wilsondebriano's sister and mother. In the Amazon era, selling online is one thing, but actually getting products to customers fast enough to make them happy is something else. It's especially difficult if, like the Wilsondebrianos, a merchant isn't selling via Amazon, but still feels obligated to match the e-commerce giant's promises of free and fast delivery. Their sales plummeted, from upward of $20,000 a month to as little as $3,000 a month, Mrs. Wilsondebriano says. The family had no choice but to pack and ship orders themselves, since they could no longer afford the third-party shipper they had been using.

amazon, warehouse, wilsondebriano, (13 more...)

Industry:

Transportation > Freight & Logistics Services (1.00)
Retail > Online (0.68)

Technology:

Information Technology > Artificial Intelligence (0.49)
Information Technology > e-Commerce (0.37)

Why data quality is key to successful ML Ops

#artificialintelligence | Oct-1-2020, 01:25:34 GMT

Machine learning has been, and will continue to be, one of the biggest topics in data for the foreseeable future. And while we in the data community are all still riding the high of discovering and tuning predictive algorithms that can tell us whether a picture shows a dog or a blueberry muffin, we're also beginning to realize that ML isn't just a magic wand you can wave at a pile of data to quickly get insightful, reliable results. Instead, we are starting to treat ML like other software engineering disciplines that require processes and tooling to ensure seamless workflows and reliable outputs. "Poor data quality is Enemy #1 to the widespread, profitable use of machine learning, and for this reason, the growth of machine learning increases the importance of data cleansing and preparation. The quality demands of machine learning are steep, and bad data can backfire twice -- first when training predictive models and second in the new data used by that model to inform future decisions."

artificial intelligence, data quality, machine learning, (14 more...)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Great Expectations: Big Data and Laplace

#artificialintelligence | Apr-29-2016, 08:15:24 GMT

Scientific determinism, as first published by Laplace in 1814, is an important and essential principle in the macro-world around us. We know that if we push something, it will move -- unless our impulse was not sufficient to overcome inertia… and so forth. Laplace postulated that if there were an omniscient daemon who knew the precise positions and impulses of each and every particle in a system, this daemon would be able to deterministically calculate each and every future state of this system. Our beloved spreadsheet calculations resemble this daemon (possibly in more than one connotation). Once we type in some basic data to start the calculations from, the wondrous spreadsheet software will automagically calculate everything that depends on it, eventually deriving the results we wanted to obtain.

artificial intelligence, big data, data mining, (15 more...)

Country: Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.05)

Industry: Media (0.33)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Software (0.97)
Information Technology > Data Science > Data Mining > Big Data (0.56)
